Abstract

Distributed functional scalar quantization (DFSQ) theory provides optimality conditions and predicts the performance of data acquisition systems in which a computation on acquired data is desired. We address two limitations of previous works: prohibitively expensive decoder design and a restriction to sources with bounded distributions. We rigorously show that a much simpler decoder has asymptotic performance equivalent to that of the conditional expectation estimator previously explored, thus reducing decoder design complexity. The simpler decoder has the feature of decoupled communication and computation blocks. Moreover, we extend the DFSQ framework with the simpler decoder to acquire sources with infinite-support distributions such as Gaussian or exponential distributions. Finally, through simulation results we demonstrate that performance at moderate coding rates is well predicted by the asymptotic analysis, and we give new insight on the rate of convergence.
I. Introduction



Functional source coding techniques are of great importance in modern distributed systems such as sensor networks and cloud computing architectures because the fidelity of acquired data can greatly impact the accuracy of computations made with that data. In this work, we provide theoretical and empirical results for quantization in distributed systems described by the topology in Fig. 1. Here, $N$ memoryless sources produce scalar realizations $X_1^N = (X_1, \ldots, X_N)$ from a joint distribution $f_{X_1^N}$ at each discrete time instant. These measurements are compressed via separate encoders and then sent to a central decoder that approximates a computation on the original data; the computation may be the identity function, meaning that the acquired samples themselves are to be reproduced.

There has been substantial effort to study distributed coding using information-theoretic concepts, taking advantage of large block lengths and powerful decoders to approach fundamental limits of compression. However, techniques inspired by this theory are infeasible for most applications. In particular, strong dependencies between source variables imply low information content per variable, but exploiting this is difficult under rigid latency requirements.

Rather than relying on long blocks, the complementary asymptotic of high-resolution quantization theory [1] is more useful for these scenarios; most of this theory is focused on the scalar case, where the block length is one. The principal previous work in applying high-resolution quantization theory to the acquisition and computation network of Fig. 1 is the distributed functional scalar quantization (DFSQ) framework [2]. The key message from DFSQ is that the design of optimal encoders for systems that perform nonlinear computations can be drastically different from what traditional quantization theory suggests. In recent years, ideas from DFSQ have been applied to compressed sensing [3], compression for media [4], and channel state feedback in wireless networks [5].

Like the information-theoretic approaches, the existing DFSQ theory relies in principle on a complicated decoder. (This is reviewed in Section II-C.) The primary contribution of this paper is to study a DFSQ framework that employs a simpler decoder.
Fig. 1. A distributed computation network, where each of $N$ spatially separated sources generates a scalar $X_n$. The scalars are encoded and communicated over rate-limited links to a central decoder without interaction between encoders. The decoder computes an estimate of the function $g(X_1^N) = g(X_1, X_2, \ldots, X_N)$ from the received data using an estimator $\hat{g}$. Each encoder is allowed transmission rate $R_n$.

Fig. 2. A block diagram for companding as a constructive method for nonuniform scalar quantization. The notation $Q_{U,K}$ is used to describe the canonical uniform quantizer with $K$ partitions in the granular region $[0, 1]$. In this paper, only the partition boundaries are scaled using the compressor function $c$; the codewords are defined through midpoint reconstruction (2) and can be computed at the decoder.



Remarkably, the same asymptotic performance is obtained with the simpler decoder, so the optimization of quantizer point 
density is unchanged. Furthermore, the simplified framework allows a greater decoupling or modularity between communication 
(source encoding/decoding) and computation aspects of the network. 

The analysis presented here uses different assumptions on the source distributions and function than [2]; neither set is uniformly more or less restrictive. Unlike in [2], we are able to allow the source variables to have infinite support. In fact, the functional setting allows us to present high-resolution quantization results for certain heavy-tailed source distributions for the first time.

We begin in Sec. II by reviewing relevant previous work and summarizing the contributions of this paper. In Sec. III and IV we give distortion and design results for a distributed network. Finally, we provide examples for the theory in Sec. V and conclude in Sec. VI.



II. Preliminaries 

A. Previous Work 

The distributed network shown in Fig. 1 is of great interest to the information theory and communications communities, and there exists a variety of results corresponding to different scenarios of interest. We present a short overview of some major works; a comprehensive review appears in [?].

In the large block length asymptotic, there are many influential and conclusive results. For the case of discrete-valued sources and $g(X_1^N) = X_1^N$, the lossless distributed source coding problem is solved by Slepian and Wolf [?]. In the lossy case, the problem is generally open except in specific situations [?], [?]. The case where $g(X_1^N) = X_1$ and the rate is unconstrained except for $R_1$ is the well-known source coding with side information problem [9]. For more general computations, the lossless [10]-[12] and lossy [13], [14] cases have both been explored.

There are also results for when the block length is constrained to be very small. We delay discussion of DFSQ until later and instead focus on related works. The use of high-resolution theory for computation has been considered in detection and estimation problems [15]-[17]. In the scalar setting, the scenario where the computation is unknown but is drawn from a set of possibilities has been studied [18]. Finally, there are strong connections between DFSQ and multidimensional companding, a technique used in perceptual coding [19].



B. High-resolution Scalar Quantizer Design

A scalar quantizer $Q_K$ is a mapping from the real line to a set of $K$ points $\mathcal{C} = \{c_k\}_{k=1}^{K} \subset \mathbb{R}$ called the codebook, where $Q_K(x) = c_k$ if $x \in P_k$ and the cells $\{P_k\}_{k=1}^{K}$ form a partition of $\mathbb{R}$. The quantizer is called regular if the partition cells are intervals containing the corresponding codewords. We then assume the codebook entries are indexed from smallest to largest and that $P_k = (p_{k-1}, p_k]$ for each $k$; this is essentially without loss of generality because the dispositions of the endpoints of the cells are immaterial to performance when the quantizer input is continuous. Regularity implies $p_0 < c_1 < p_1 < c_2 < \cdots < c_K < p_K$, with $p_0 = -\infty$ and $p_K = \infty$. Define the granular region as $(c_1, c_K)$ and its complement $(-\infty, c_1] \cup [c_K, \infty)$ as the overload region.

Uniform (linear) quantization, where partition cells in the granular region have equal length, is most commonly used in practice, but other quantizer designs are possible. Fig. 2 presents the compander model as a method for generating nonuniform quantizers from a uniform one. In this model, the scalar source is transformed using a nondecreasing and smooth compressor function $c : \mathbb{R} \to [0, 1]$, then quantized using a uniform quantizer comprising $K$ levels on the granular region $[0, 1]$, and finally passed through the expander function $c^{-1}$. Compressor functions are defined such that $\lim_{x \to -\infty} c(x) = 0$ and $\lim_{x \to \infty} c(x) = 1$. It is convenient to define a point density function as $\lambda(x) = c'(x)$. Because of the extremal conditions on $c$, there is a one-to-one correspondence between $\lambda$ and $c$, and hence a quantizer of the form shown in Fig. 2 can be uniquely specified using a point density function and codebook size. We denote such a quantizer as $Q_{K,\lambda}$. By virtue of this definition, the integral of the point density function over any quantizer cell is $1/K$:

$$\int_{p_{k-1}}^{p_k} \lambda(x)\,dx = \frac{1}{K}, \qquad k = 1, 2, \ldots, K. \tag{1}$$

In practice, scalar quantization is rarely, if ever, performed by an explicit companding operation. A slight modification that avoids repeated computation of $c^{-1}$ is to apply the compressor $c$, compare to threshold values (multiples of $1/K$) to determine the partition cell, and then obtain $c_k$ from a precomputed table. We assume that the non-extremal reconstruction values are set to the midpoints of the cells, i.e.,

$$c_k = \frac{p_{k-1} + p_k}{2}, \qquad k = 2, 3, \ldots, K - 1. \tag{2}$$

This is suboptimal relative to centroid reconstruction, but has the simplicity of depending only on $\lambda$ and $K$, not on the source density. The extremal reconstruction values are fixed to be $c_1 = p_1$ and $c_K = p_{K-1}$. This again is suboptimal but does not depend on the source distribution. We will show later that this construction does not affect asymptotic quantizer performance.
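The table-lookup construction described above can be sketched in a few lines of Python. The sketch builds the partition boundaries $p_k = c^{-1}(k/K)$ and the codewords of (2), including the extremal values $c_1 = p_1$ and $c_K = p_{K-1}$. The logistic compressor and the names `logistic`, `logistic_inv`, `build_codebook` and `quantize` are illustrative assumptions, not part of the framework; any compressor satisfying the limit conditions could be substituted.

```python
import math

def logistic(x):
    """Assumed compressor c: R -> (0, 1), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def logistic_inv(u):
    """Expander c^{-1}: (0, 1) -> R for the logistic compressor."""
    return math.log(u / (1.0 - u))

def build_codebook(c_inv, K):
    """Return (boundaries, codewords) of a K-level companding quantizer, K >= 2.

    boundaries[k] = p_k for k = 0..K, with p_0 = -inf and p_K = +inf.
    codewords[k - 1] = c_k: midpoints for k = 2..K-1, and the extremal
    values c_1 = p_1 and c_K = p_{K-1}.
    """
    p = [-math.inf] + [c_inv(k / K) for k in range(1, K)] + [math.inf]
    codewords = []
    for k in range(1, K + 1):
        if k == 1:
            codewords.append(p[1])
        elif k == K:
            codewords.append(p[K - 1])
        else:
            codewords.append(0.5 * (p[k - 1] + p[k]))
    return p, codewords

def quantize(x, c, codewords):
    """Apply the compressor, compare against multiples of 1/K, look up c_k."""
    K = len(codewords)
    cell = min(K - 1, int(c(x) * K))  # 0-based cell index
    return codewords[cell]
```

For example, `build_codebook(logistic_inv, 4)` followed by `quantize(0.3, logistic, codewords)` returns the midpoint of the cell between $p_2 = 0$ and $p_3 = \ln 3$. Inputs that fall exactly on a boundary are assigned to the upper cell here, which is immaterial for continuous sources.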

The utility of the compander model is that we can precisely analyze the distortion behavior as $K$ becomes large and use this to optimize $\lambda$. Assuming the source is well-modeled as being drawn iid from a probability distribution, we define the mean-squared error (MSE) distortion as

$$D_{\mathrm{mse}}(K, \lambda) = E\left[|X - Q_{K,\lambda}(X)|^2\right], \tag{3}$$

where the expectation is with respect to the source density $f_X$. Under the additional assumption that $f_X$ is continuous (or simply measurable) with tails that decay sufficiently fast,

$$D_{\mathrm{mse}}(K, \lambda) \simeq \frac{1}{12K^2}\, E\left[\lambda^{-2}(X)\right], \tag{4}$$

where $\simeq$ indicates that the ratio of the two expressions approaches 1 as $K$ increases [20], [21]. Hence, the MSE performance of a scalar quantizer can be approximated by a simple relationship between the source distribution, point density and codebook size, and this relation becomes more precise with increasing $K$. Moreover, quantizers designed according to this approximation are asymptotically optimal, meaning that the quantizer optimized over $\lambda$ has distortion that approaches the performance of the best $Q_K$ found by any means [22]-[24], that is,

$$\min_{Q_K} E\left[|X - Q_K(X)|^2\right] \simeq \min_{\lambda} \frac{1}{12K^2}\, E\left[\lambda^{-2}(X)\right]. \tag{5}$$

Experimentally, the approximation is accurate even for moderate $K$ [?], [25]. Since distortion depends only on $\lambda$ in the asymptote, calculus techniques can be used to optimize companders.

When the quantized values are to be communicated or stored, it is natural to map codewords to a string of bits and consider the trade-off between performance and communication rate $R$, defined to be the expected number of bits per sample. In the simplest case, the codewords are indexed and the communication rate is $R = \log_2 K$; this is called fixed-rate or codebook-constrained quantization. Hölder's inequality can be used to show that the optimal point density for fixed-rate quantization is

$$\lambda^*_{\mathrm{mse,fr}}(x) \propto f_X^{1/3}(x), \tag{6}$$

and the resulting distortion is

$$D^*_{\mathrm{mse,fr}}(R) \simeq \frac{1}{12}\, \|f_X\|_{1/3}\, 2^{-2R}, \tag{7}$$

with the notation $\|f\|_p = \left(\int_{-\infty}^{\infty} f^p(x)\,dx\right)^{1/p}$ [26].

In general, the codeword indices can be coded to produce bit strings of different lengths based on probabilities of occurrence; this is referred to as variable-rate quantization. If the decoding latency is allowed to be large, one can employ block entropy coding and the communication rate approaches $H(Q_{K,\lambda}(X))$. This particular scenario, called entropy-constrained quantization, can be analyzed using Jensen's inequality to show the optimal point density $\lambda^*_{\mathrm{mse,ec}}$ is constant on the support of the input distribution [26]. The optimal quantizer is uniform and the resulting distortion is

$$D^*_{\mathrm{mse,ec}}(R) \simeq \frac{1}{12}\, 2^{2h(X)}\, 2^{-2R}. \tag{8}$$

Note that block entropy coding suggests that the sources are transmitted in blocks even though the quantization is scalar. As such, (8) is an asymptotic result and serves as a lower bound on the distortion of practical entropy coders with finite block lengths that match the latency restrictions of a system.
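The constants multiplying $2^{-2R}$ in (7) and (8) can be evaluated numerically for a given source density. The sketch below does this for a standard normal source, which is an assumption made here purely for illustration, using a simple midpoint-rule integrator (`integrate` is a hypothetical helper, not a library routine). For this source the two constants have the closed forms $\sqrt{3}\pi/2 \approx 2.72$ and $2\pi e/12 \approx 1.42$, which the numerical values should reproduce.

```python
import math

def gaussian_pdf(x):
    """Standard normal density (illustrative choice of source)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def integrate(fn, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of fn over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(fn(lo + (i + 0.5) * h) for i in range(n))

# Fixed-rate constant in (7): (1/12) * ||f_X||_{1/3},
# where ||f||_{1/3} = (integral of f^{1/3})^3.
norm_one_third = integrate(lambda x: gaussian_pdf(x) ** (1.0 / 3.0), -40.0, 40.0) ** 3
fixed_rate_const = norm_one_third / 12.0

# Entropy-constrained constant in (8): (1/12) * 2^{2 h(X)}, with h(X) in bits.
def neg_f_log2_f(x):
    f = gaussian_pdf(x)
    return -f * math.log2(f) if f > 0.0 else 0.0  # guard against underflow to 0

h_bits = integrate(neg_f_log2_f, -40.0, 40.0)
entropy_const = 2.0 ** (2.0 * h_bits) / 12.0
```

The ratio `fixed_rate_const / entropy_const` then gives the asymptotic distortion penalty of fixed-rate coding relative to ideal block entropy coding for this particular source.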

In general, the optimal entropy-constrained quantizer (at a finite rate) for a distribution with unbounded support can have an infinite number of codewords [27]. The compander model used in this paper cannot generate all such quantizers. A common alternative is to allow the codomain of $c$ to be $\mathbb{R}$ rather than $[0, 1]$, resulting in a point density that cannot be normalized [28], [29]. To avoid parallel developments for normalized and unnormalized point densities, we restrict our attention to quantizers that have a finite number of codewords $K$ at any finite rate $R$. This may preclude exact optimality, but it does not change the asymptotic behavior as $K$ and $R$ increase without bound. Specifically, the contribution to overall distortion from the overload region is made negligible as $K$ and $R$ increase, so the distinction between having finitely or infinitely many codewords becomes unimportant.
C. Functional Scalar Quantizer Design

In a distributed network where the encoders employ scalar quantization and the decoder performs a known computation, optimizing for the computation rather than source fidelity can lead to substantial gains. In [2], distortion performance and quantizer design are discussed for the distributed setting shown in Fig. 1 with $g$ a scalar-valued function. For DFSQ, the cost of interest is functional MSE (fMSE):

$$D_{\mathrm{fmse}}(K_1^N, \lambda_1^N) = E\left[\left|g(X_1^N) - \hat{g}\left(Q_{K_1^N,\lambda_1^N}(X_1^N)\right)\right|^2\right], \tag{9}$$

where $g$ is a scalar function of interest, $\hat{g}$ is the optimal fMMSE estimator

$$\hat{g}(x_1^N) = E\left[g(X_1^N) \,\middle|\, Q_{K_1^N,\lambda_1^N}(X_1^N) = Q_{K_1^N,\lambda_1^N}(x_1^N)\right], \tag{10}$$

and $Q_{K_1^N,\lambda_1^N}$ is scalar quantization performed on a vector such that

$$Q_{K_1^N,\lambda_1^N}(x_1^N) = \left(Q_{K_1,\lambda_1}(x_1), \ldots, Q_{K_N,\lambda_N}(x_N)\right).$$

Note the complexity of computing $\hat{g}$: it requires integrating over an $N$-dimensional partition cell with knowledge of the joint source density $f_{X_1^N}$. Later in this paper, we avoid this complexity by setting $\hat{g}$ to equal $g$.

Before understanding how a quantizer affects fMSE, it is convenient to define how a computation locally affects distortion. 

Definition 1. The univariate functional sensitivity profile of a function $g$ is defined as

$$\gamma(x) = |g'(x)|.$$

The $n$th functional sensitivity profile of a multivariate function $g$ is defined as

$$\gamma_n(x) = \left(E\left[|g_n(X_1^N)|^2 \,\middle|\, X_n = x\right]\right)^{1/2}, \tag{11}$$

where $g_n(x)$ is the partial derivative of $g$ with respect to its $n$th argument evaluated at the point $x$.

Given the sensitivity profile, the main result of [2] says

$$D_{\mathrm{fmse}}(K_1^N, \lambda_1^N) \simeq \sum_{n=1}^{N} \frac{1}{12K_n^2}\, E\left[\left(\frac{\gamma_n(X_n)}{\lambda_n(X_n)}\right)^2\right], \tag{12}$$



provided the following conditions are satisfied: 

MF1. The function $g$ is Lipschitz continuous and twice differentiable in every argument except possibly on a set of Jordan measure 0.

MF2. The source pdf $f_{X_1^N}$ is continuous, bounded, and supported on $[0, 1]^N$.

MF3. The function $g$ and point densities $\lambda_n$ allow $E\left[(\gamma_n(X_n)/\lambda_n(X_n))^2\right]$ to be defined and finite for all $n$.

Following the same recipes to optimize over $\lambda_1^N$, the relationship between distortion and communication rate is found. In both the fixed-rate and entropy-constrained cases, the sensitivity acts to shift quantization points to where they can reduce the distortion in the computation. For fixed rate, the minimum high-resolution distortion is achieved by

$$\lambda^*_{n,\mathrm{fmse,fr}}(x) \propto \left(\gamma_n^2(x)\, f_{X_n}(x)\right)^{1/3}, \tag{13}$$

where $f_{X_n}$ is the marginal distribution of $X_n$. In the entropy-constrained case, the optimizing point density is

$$\lambda^*_{n,\mathrm{fmse,ec}}(x) \propto \gamma_n(x). \tag{14}$$

Notice that unnormalized point densities are not required here since the source is assumed to have bounded support.



D. Main Contributions of Paper 

The central goal of this paper is to develop a more practical method upon the theoretical foundations of [2]. In particular, we provide new insight on how a simplified decoder can be used in lieu of the optimal one in (10). Although the conditional expectations are offline computations, they may be extremely difficult and are computationally infeasible for large $N$ and $K$. We consider the case when the decoder is restricted to applying the function $g$ explicitly on the quantized measurements. To accommodate this change and provide more intuitive proofs, a slightly different set of conditions is required of $g$, $\lambda_1^N$, and $f_{X_1^N}$.

Additionally, we generalize the theory to infinite-support source variables and vector-valued computations. In brief, we derive 
new conditions on the tail of the source density and computation that allow the distortion to be stably computed. Interestingly, 
this extends the class of probability densities under which high-resolution analysis techniques have been successfully applied. 
The generalization to vector-valued g is a more straightforward extension that is included for completeness. We present several 
examples to illustrate the framework and the convergence to the asymptotics developed in this work. 
III. Univariate Functional Quantization

We first discuss the quantization of a scalar random variable $X$ by $Q_{K,\lambda}$, with the result fed into a function $g$ in order to approximate $g(X)$. As mentioned, this is a simpler decoder than the one analyzed in [2]. We find the dependence of fMSE on $\lambda$ and then optimize with respect to $\lambda$ to minimize fMSE.

Assume a companding quantizer with point density $\lambda$ and granular region $S_K \subset \mathbb{R}$. Consider the following conditions on the computation $g$ and the density $f_X$ of the source:

UF1'. The source pdf $f_X$ is continuous and strictly positive on $S_K$ for any finite $K$.

UF2'. The function $g$ is continuous on $S_K$ with both $|g'|$ and $|g''|$ defined and bounded by a finite constant $C_u$.
UF3'. $f_X(x)\,|g'(x)|^{2-m} / \lambda^{2+m}(x)$ is Riemann integrable over $S_K$ for $m = 0, 1, 2$.

UF4'. $f_X$, $g$, and $\lambda$ satisfy the tail condition

$$\lim_{y \to \infty} \frac{\int_y^{\infty} |g(x) - g(y)|^2 f_X(x)\,dx}{\left(\int_y^{\infty} \lambda(x)\,dx\right)^2} = 0,$$

and the corresponding condition for $y \to -\infty$.

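The main result of this section is on the fMSE induced by a quantizer $Q_{K,\lambda}$ under these conditions:

Theorem 1. Assume $f_X$, $g$, and $\lambda$ satisfy conditions UF1'-UF4'. Then the fMSE

$$D_{\mathrm{fmse}}(K, \lambda) = E\left[|g(X) - g(Q_{K,\lambda}(X))|^2\right] \tag{15}$$

has the following limit:

$$\lim_{K \to \infty} K^2 D_{\mathrm{fmse}}(K, \lambda) = \frac{1}{12}\, E\left[\left(\frac{\gamma(X)}{\lambda(X)}\right)^2\right]. \tag{16}$$

Proof: See Appendix A.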
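The limit in Theorem 1 can be checked numerically on a simple case. The sketch below assumes a standard normal source, the computation $g(x) = \sin x$ (so that $|g'|$ and $|g''|$ are bounded), and a logistic compressor with $\lambda(x) = c(x)(1 - c(x))$. These choices, the truncation of the integration range to $[-10, 10]$, and the function names are illustrative assumptions and not part of the theorem. The code compares $K^2 D_{\mathrm{fmse}}(K, \lambda)$, obtained by direct numerical integration with the simple decoder, against $\frac{1}{12} E[(\gamma(X)/\lambda(X))^2]$.

```python
import math

def logistic(x):
    """Assumed compressor c(x), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def gaussian_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def theorem1_check(K=512, half_width=10.0, n=400_000):
    """Return (K^2 * fMSE of the simple decoder, (1/12) E[(gamma/lambda)^2])."""
    g, g_prime = math.sin, math.cos
    # Boundaries p_1..p_{K-1} of the logistic compander: p_k = c^{-1}(k/K).
    p = [math.log(k / (K - k)) for k in range(1, K)]
    # Codewords: c_1 = p_1, midpoints for k = 2..K-1, c_K = p_{K-1}.
    codewords = [p[0]] + [0.5 * (p[k - 2] + p[k - 1]) for k in range(2, K)] + [p[-1]]
    g_at_codeword = [g(c) for c in codewords]

    h = 2.0 * half_width / n  # midpoint-rule step
    fmse = 0.0
    limit = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        s = logistic(x)
        fx = gaussian_pdf(x)
        cell = min(K - 1, int(s * K))  # 0-based index of the cell containing x
        fmse += (g(x) - g_at_codeword[cell]) ** 2 * fx
        lam = s * (1.0 - s)            # point density lambda(x) = c'(x)
        limit += (g_prime(x) / lam) ** 2 * fx
    return K * K * fmse * h, limit * h / 12.0
```

Calling `theorem1_check()` returns the two quantities; under these assumptions their ratio is expected to approach 1 as `K` grows, and it can be tracked over a range of `K` to observe the convergence.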



A. Remarks 

1. The fMSE in (16) is the same as in (12). We emphasize that the theorem shows that this fMSE is obtained by simply applying $g$ to the quantized variables rather than using the optimal decoder $\hat{g}$ from (10). Further analysis on this point is given in Sec. III-C.

2. When $g$ is monotonic, the performance (16) is as good as quantizing and communicating $g(X)$ [2, Lemma 5]. Otherwise, the use of a regular quantizer results in a distortion penalty, as illustrated in Example 1.

3. One key contribution of this theorem is the additional tail condition for infinite-support source densities, which effectively limits the distortion contribution in the overload region. This generalizes the class of probability densities for which distortion can be stably bounded using high-resolution approximations [22]-[24]. We will demonstrate this with quantization of a Cauchy-distributed scalar in Example 2.

4. For linear computations, the sensitivity is flat, meaning the optimal quantizer is the same as in the MSE-optimized case. Hence, functional theory will lead to new quantizer designs only when the computation is nonlinear.

5. In the proof of Theorem 1, the first mean-value theorem is used on both $f_X$ and $\lambda$, which requires these densities to be continuous. However, this requirement can be loosened to piecewise-continuous distributions provided the tail conditions still hold and a minor adjustment is made on how partition boundaries are chosen [23]. Rather than elaborating further, we refer the reader to a similar extension in [2, Sec. III-F]. An equivalent argument can also be made for $g$ having a finite number of discontinuities in its first and second derivatives.

6. The theorem assumes that $g$ is continuous and differentiable on the granular region, with positive sensitivity. However, for explicit regions where $g'(x) = 0$, "don't care" regions can be used to relax these conditions [2, Sec. VII].



B. Asymptotically Optimal Quantizer Sequences 

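Since the fMSE of Theorem 1 matches (12), the optimizing quantizers are the same. Using the recipe of Sec. II-B, we can show the optimal point density for fixed-rate quantization is

$$\lambda^*_{\mathrm{fmse,fr}}(x) = \frac{\left(\gamma^2(x) f_X(x)\right)^{1/3}}{\int_{-\infty}^{\infty} \left(\gamma^2(t) f_X(t)\right)^{1/3} dt} \tag{17}$$

with distortion

$$D^*_{\mathrm{fmse,fr}}(R) \simeq \frac{1}{12}\, \|\gamma^2 f_X\|_{1/3}\, 2^{-2R}. \tag{18}$$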
Fig. 3. (a) Codeword placement under simple, MMSE, and fMMSE decoders. The simple decoder performs midpoint reconstruction followed by the application of the computation $g$. The MMSE decoder applies $g$ to the conditional expectation of $X$ within the cell. Finally, the fMMSE decoder determines (10) for the cell. In this example, the source distribution is exponential and the computation is concave. (b) Performance loss due to the suboptimal codeword placement with respect to rate. We can see that relative excess fMSE decreases linearly with rate and hence the fMSE values of the resulting quantizers are asymptotically equivalent.



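Meanwhile, optimization in the entropy-constrained case yields

$$\lambda^*_{\mathrm{fmse,ec}}(x) = \frac{\gamma(x)}{\int_{-\infty}^{\infty} \gamma(t)\,dt}, \tag{19}$$

giving distortion

$$D^*_{\mathrm{fmse,ec}}(R) \simeq \frac{1}{12}\, 2^{2h(X) + 2E[\log \gamma(X)]}\, 2^{-2R}. \tag{20}$$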

C. Negligible Suboptimality of Simple Decoder 

Recall that the simple decoder analyzed in this work is the computation $g$ applied to midpoint reconstruction as formulated in (2). One can do better by applying $g$ after finding the conditional MMSE estimate of $X$ utilizing knowledge of the source distribution only, or by using the fMMSE estimator (10), which incorporates the function as well. The codeword placements of the three decoders are visualized through an example in Fig. 3(a). The asymptotic match of the performance of the simple decoder to the optimal estimator (10) is a main contribution of this paper.

The simple decoder is suboptimal because it does not consider the source distribution at all, or equivalently assumes the distribution is uniform and the sensitivity is constant over the cell. High-resolution analysis typically approximates the source distribution as uniform over small cells [29], and the proof of Theorem 1 utilizes the fact that the sensitivity is approximately flat over very small regions as well. Hence, the performance gap between the simple decoder and the fMMSE estimator becomes negligible in the high-resolution regime.

To illuminate the rate of convergence, we study the performance gap as a function of quantization cell width, which is dependent on the communication rate (Fig. 3(b)). We see that the relative excess fMSE (defined as $(D_{\mathrm{dec}} - D_{\mathrm{opt}})/D_{\mathrm{opt}}$) decays exponentially in rate, meaning

$$\frac{D_{\mathrm{dec}}}{D_{\mathrm{opt}}} = 1 + c_1 e^{-c_2 R} \tag{21}$$

for some constants $c_1$ and $c_2$. The speed at which the performance gap shrinks contributes greatly to why the high-resolution theory is successful even at low communication rates.

IV. Multivariate Functional Quantization 

We now describe the main result of the paper for the scenario shown in Fig. 1, where $N$ random scalars $(X_1, \ldots, X_N)$ are individually quantized and a scalar computation $g(X_1^N)$ is performed. Assume the following conditions on the multivariate joint density, computation and quantizers over a granular region $S_K \subset \mathbb{R}^N$:

MF1'. The joint pdf $f_{X_1^N}$ is continuous and always positive on $S_K$ for any finite $K$.

MF2'. The multivariate function $g$ is continuous and twice differentiable in every argument over $S_K$. Every first- and second-order derivative is uniformly bounded by a constant $C_m$.

MF3'. For any i,j <E {1, . . . , n} and m = 0, 1, 2, fx^Xj x j)l7 l ( x i)^7 1 ( x i)^J 1 ( x j) * s Ri emann integrable over Sk- 
MF4'. We adopt the notation $x_{\setminus n}$ for $x_1^N$ with the $n$th element removed; an inverse operator $x(x_n, x_{\setminus n})$ outputs a length-$N$ vector with $x_n$ in the $n$th element. Then for every index $n$, the following holds for every $x_{\setminus n}$:

$$\lim_{y \to \infty} \frac{\int_y^{\infty} \left|g(x(x, x_{\setminus n})) - g(x(y, x_{\setminus n}))\right|^2 f_{X_1^N}(x(x, x_{\setminus n}))\,dx}{\left(\int_y^{\infty} \lambda_n(x)\,dx\right)^2} = 0.$$

An analogous condition holds for the corresponding negative-valued tails.

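Recalling that $Q_{K_1^N,\lambda_1^N}$ and $\lambda_1^N$ represent a set of $N$ quantizers and point densities respectively, we present a theorem similar to Theorem 1.

Theorem 2. Assume $f_{X_1^N}$, $g$, and $\lambda_1^N$ satisfy conditions MF1'-MF4'. Also assume a fractional allocation $\alpha_1^N$ such that every $\alpha_n > 0$ and $\sum_n \alpha_n = 1$, meaning a set of quantizers $Q_{K_1^N,\lambda_1^N}$ will have $K_n = \alpha_n \kappa$ for some total allocation $\kappa$. Then the fMSE

$$D_{\mathrm{fmse}}(K_1^N, \lambda_1^N) = E\left[\left|g(X_1^N) - g\left(Q_{K_1^N,\lambda_1^N}(X_1^N)\right)\right|^2\right] \tag{22}$$

of this distributed system has the following limit:

$$\lim_{\kappa \to \infty} \kappa^2 D_{\mathrm{fmse}}(K_1^N, \lambda_1^N) = \sum_{n=1}^{N} \frac{1}{12\alpha_n^2}\, E\left[\left(\frac{\gamma_n(X_n)}{\lambda_n(X_n)}\right)^2\right]. \tag{23}$$

Proof: See Appendix B.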



A. Remarks 

1. As in the univariate case, the simple decoder has performance that is asymptotically equivalent to the more complicated optimal decoder (10).

2. Here, the computation cannot generally be performed before quantization because encoders are distributed. The exception is when the computation is separable, meaning it can be decomposed into a linear combination of computations on individual scalars. As a result, the sensitivity is no longer a conditional expectation and quantizer design simplifies to the univariate case, as demonstrated in Example 3.

3. The strict requirements of MF1' and MF2' could potentially be loosened. However, simple modification of individual quantizers as in the univariate case is insufficient since discontinuities may lie on a manifold that is not aligned with the partition lattice of the $N$-dimensional space. As a result, the error from using a planar approximation through Taylor's theorem will be $O(1/\kappa)$, which is no longer negligible. However, based on experimental observations, such as in Example 4, we believe that when these discontinuities exist on a manifold of Jordan measure zero their error may be accounted for. Techniques similar to those in the proofs from [2] could potentially be useful in showing this rigorously.



B. Asymptotically Optimal Quantizer Sequences 

As in the univariate case, the optimal quantizers match those in previous DFSQ work since the distortion equations are the same. Using Hölder's inequality, the optimal point density for fixed-rate quantization for each source $n$ (communicated with rate $R_n$) is

$$\lambda^*_{n,\mathrm{fmse,fr}}(x) = \frac{\left(\gamma_n^2(x) f_{X_n}(x)\right)^{1/3}}{\int_{-\infty}^{\infty} \left(\gamma_n^2(t) f_{X_n}(t)\right)^{1/3} dt} \tag{24}$$

with fMSE

$$D^*_{\mathrm{fmse,fr}}(R_1^N) \simeq \frac{1}{12} \sum_{n=1}^{N} \|\gamma_n^2 f_{X_n}\|_{1/3}\, 2^{-2R_n}. \tag{25}$$

Similarly, the best point density for the entropy-constrained case is

$$\lambda^*_{n,\mathrm{fmse,ec}}(x) = \frac{\gamma_n(x)}{\int_{-\infty}^{\infty} \gamma_n(t)\,dt}, \tag{26}$$

leading to an fMSE of

$$D^*_{\mathrm{fmse,ec}}(R_1^N) \simeq \frac{1}{12} \sum_{n=1}^{N} 2^{2h(X_n) + 2E[\log \gamma_n(X_n)]}\, 2^{-2R_n}. \tag{27}$$

The rate allocations in (25) and (27) are allowed to vary. Given a total communication rate $R$, the optimal choice of $R_1^N$ is known [2], [30].



C. Vector-valued Functions 

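In Theorem 2, we assumed the computation $g$ is scalar-valued. For completeness, we now consider vector-valued functions, where the output of $g$ is a vector in $\mathbb{R}^M$. Here, the distortion measure is a weighted fMSE:

$$D_{\mathrm{fmse}}(K_1^N, \lambda_1^N, \beta_1^M) = \sum_{m=1}^{M} \beta_m\, E\left[\left|g^{(m)}(X_1^N) - g^{(m)}\left(Q_{K_1^N,\lambda_1^N}(X_1^N)\right)\right|^2\right], \tag{28}$$

where $\beta_1^M$ is a set of scalar weights and $g^{(m)}$ is the $m$th entry of the output of $g$. Through a natural extension of the proof of Theorem 2, we can find the limit of the weighted fMSE assuming each entry of the vector-valued function satisfies MF1'-MF4'.

Corollary 1. The weighted fMSE of a source $f_{X_1^N}$, computation $g$, set of scalar quantizers $Q_{K_1^N,\lambda_1^N}$, and fractional allocation $\alpha_1^N$ has the following limit:

$$\lim_{\kappa \to \infty} \kappa^2 D_{\mathrm{fmse}}(K_1^N, \lambda_1^N, \beta_1^M) = \sum_{n=1}^{N} \frac{1}{12\alpha_n^2}\, E\left[\left(\frac{\gamma_n(X_n)}{\lambda_n(X_n)}\right)^2\right], \tag{29}$$

where the sensitivity profile is

$$\gamma_n(x) = \left(\sum_{m=1}^{M} \beta_m\, E\left[|g_n^{(m)}(X_1^N)|^2 \,\middle|\, X_n = x\right]\right)^{1/2}. \tag{30}$$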



V. Examples 

In this section, we present examples for both univariate and multivariate functional quantization using asymptotic expressions and empirical results from sequences of real quantizers. The empirical results are encouraging since the convergence to asymptotic limits is fast, usually occurring when the quantizer rate is about 4 bits per source variable. This is because the Taylor remainder term in the distortion calculation decays with an extra factor of $\kappa$, which is exponential in the rate.



A. Examples for Univariate Functional Quantization 

Below we present two examples of functional quantization in the univariate case. The theoretical results follow directly from Sec. III.

Example 1. Assume $X \sim \mathcal{N}(0, 1)$ and $g(x) = x^2$, yielding a sensitivity profile $\gamma(x) = 2|x|$. We consider uniform quantizers, optimal "ordinary" quantizers (quantizers optimized for distortion of the source variable rather than the computation) given in Sec. II-B, and optimal functional quantizers given in Sec. III-B for a range of rates. The point densities of these quantizers, the source density $f_X$, and computation $g$ satisfy UF1'-UF4' and hence we utilize Theorem 1 to find asymptotic distortion performance. We also design practical quantizers for a range of $R$ and find the empirical fMSE through Monte Carlo simulations using a random Gaussian source. In the fixed-rate case, theoretical and empirical performance are shown (Fig. 4).

The distortion-minimizing uniform quantizer has a granular region that depends on $R$, which was explored in [?]. Here, we simply perform a brute-force search to find the best granular region and the corresponding distortion. Surprisingly, this choice of the uniform quantizer performs better over moderate rate regions than the MSE-optimized quantizer. This is because the computation is least sensitive where the source is most likely, which is also where the MSE-optimized quantizer places most of its codewords. Hence, one lesson from DFSQ is that using standard high-resolution theory may yield worse performance than a naive approach for some computations. Meanwhile, the functional quantizer optimizes for the computation and gives an additional 3 dB gain over the optimal ordinary quantizer. There is still a loss in using regular quantizers due to the computation being nonmonotonic. In fact, if the computation can be performed prior to quantization, we gain an extra bit for encoding the magnitude and thus 6 dB of performance. This illustrates Remark 2 of Sec. III-A.

In the fixed-rate case, the empirical performance approaches the distortion limit described by Theorem 1. The convergence is fast and the asymptotic results predict practical quantizer performance at rates as low as 4 bits/sample.
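The asymptotic gain of the functional quantizer over the ordinary quantizer in Example 1 can be reproduced from the fixed-rate expressions alone. The following sketch evaluates $\frac{1}{12} E[(\gamma(X)/\lambda(X))^2]$ for the ordinary point density $\lambda \propto f_X^{1/3}$ and $\frac{1}{12}\|\gamma^2 f_X\|_{1/3}$ for the functional point density, both by midpoint-rule integration. The helper `integrate`, the truncation limit `LIM`, and the variable names are assumptions made for illustration; the resulting gain should come out close to the 3 dB figure quoted in the example.

```python
import math

def gaussian_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def integrate(fn, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of fn over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(fn(lo + (i + 0.5) * h) for i in range(n))

def gamma(x):
    """Sensitivity profile of g(x) = x^2."""
    return 2.0 * abs(x)

LIM = 25.0  # truncation of the real line; keeps gaussian_pdf away from underflow

# Ordinary (MSE-optimized) fixed-rate point density: lambda proportional to f^{1/3}.
z = integrate(lambda x: gaussian_pdf(x) ** (1.0 / 3.0), -LIM, LIM)

def lam_ordinary(x):
    return gaussian_pdf(x) ** (1.0 / 3.0) / z

# fMSE constant of the ordinary quantizer: (1/12) E[(gamma / lambda)^2].
ordinary_const = integrate(
    lambda x: (gamma(x) / lam_ordinary(x)) ** 2 * gaussian_pdf(x), -LIM, LIM
) / 12.0

# fMSE constant of the functional quantizer: (1/12) ||gamma^2 f||_{1/3}.
functional_const = integrate(
    lambda x: (gamma(x) ** 2 * gaussian_pdf(x)) ** (1.0 / 3.0), -LIM, LIM
) ** 3 / 12.0

gain_db = 10.0 * math.log10(ordinary_const / functional_const)
```

Both constants multiply $2^{-2R}$, so `gain_db` is independent of the rate in the high-resolution regime.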

Fig. 4. Empirical and theoretical performance for the uniform, ordinary and functional quantizers, as well as the case when the computation is performed first. The distribution is standard normal and $g(x) = x^2$. The distortions are multiplied by $2^{2R}$ to better indicate the convergence results.

Fig. 5. Empirical and theoretical performance for the uniform, ordinary and functional quantizers for Cauchy $f_X$ and $g(x) = e^{-|x|}$. The distortions are multiplied by $2^{2R}$ to better indicate the convergence results.

Example 2. Let a source $X$ be distributed according to the Cauchy distribution centered around 0. This heavy-tailed density is special in that the mean and all higher moments are not defined. Hence, it does not satisfy the conditions needed for the high-resolution theory previously specified in [22]-[24]. However, the functional distortion can be asymptotically determined assuming Condition UF4' is satisfied. The computation $g(x) = \exp(-|x|)$ and the Cauchy density satisfy UF4', and we confirm that experimental results match the theoretical computation of asymptotic distortion in Fig. 5.



B. Examples for Multivariate Functional Quantization 

We next provide two examples that follow from the theory of Sec. IV. 

Example 3. Let $N$ sources be iid standard normal random variables and the computation be $g(x_1^N) = \sum_{n=1}^{N} x_n^2$. 
Since the computation is separable, the sensitivity profile of each source is $\gamma_n(x) = |x|$, and the quantizers are the 
same as in Example 1. The distortion is also the same, except now scaled by $N$. 

Example 4. Let $N$ sources be iid exponential with parameter $\lambda = 1$ and the computation be $g(x_1^N) = \min(x_1^N)$. In 
this case, Condition MF2' is not satisfied since there exist $N(N-1)/2$ two-dimensional planes where the derivative is not 
defined. However, as discussed in the remarks of Theorem 2, we strongly suspect we can disregard the distortion contributions 
from these surfaces. The overall performance, ignoring the violation of Condition MF2', may be analyzed using the sensitivity: 

$$
\begin{aligned}
\gamma_n(x) &= \left( \mathrm{E}\left[ |g_n(X_1^N)|^2 \mid X_n = x \right] \right)^{1/2} \\
&= \left( \Pr\{ \min(X_1^N) = X_n \mid X_n = x \} \right)^{1/2} \\
&= \left( e^{-\lambda x} \right)^{(N-1)/2},
\end{aligned}
$$

where the third line follows from the cdf of exponential random variables. 
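The sensitivity in Example 4 can be sanity-checked by simulation. The sketch below compares the closed form $(e^{-\lambda x})^{(N-1)/2}$ with a Monte Carlo estimate of $(\Pr\{\min(X_1^N) = X_n \mid X_n = x\})^{1/2}$, using the fact that $X_n = x$ is the minimum exactly when the other $N-1$ sources all exceed $x$. The function names, trial count and seed are illustrative assumptions, not part of the paper.

```python
import math
import random

def sensitivity_min_exponential(x, n_sources, rate=1.0):
    """Closed-form sensitivity (e^{-rate * x})^{(N-1)/2} for g = min."""
    return math.exp(-rate * x * (n_sources - 1) / 2.0)

def estimate_sensitivity(x, n_sources, rate=1.0, trials=50000, seed=0):
    """Monte Carlo estimate of sqrt(Pr{min(X_1^N) = X_n | X_n = x})."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Conditioned on X_n = x, source n attains the minimum
        # if and only if every other iid exponential source exceeds x.
        if all(rng.expovariate(rate) > x for _ in range(n_sources - 1)):
            hits += 1
    return math.sqrt(hits / trials)
```

For example, `estimate_sensitivity(0.3, 4)` and `sensitivity_min_exponential(0.3, 4)` should agree to within Monte Carlo error.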

In Fig. 6, we experimentally verify that the asymptotic predictions are precise. This serves as evidence that MF2' may be 
loosened. 



VI. Conclusions 

In this paper, we have extended distributed functional scalar quantization to a general class of finite- and infinite-support 
distributions, and demonstrated that a simple decoder, performing the computation directly on the quantized measurements, 
achieves asymptotically equivalent performance to the fMMSE decoder. Although there are some technical restrictions on the 
source distributions and computations to ensure the high-resolution approximations are legitimate, the main goal of the paper 
is to show that DFSQ theory is widely applicable to distributed acquisition systems without requiring a complicated decoder. 
Furthermore, the asymptotic results give good approximations for the performance at moderate quantization rates. 

Fig. 6. Empirical and theoretical performance are provided for the uniform, ordinary and functional quantizers for $N = 10$, 
exponential $f_X$, and $g(x_1^N) = \min(x_1^N)$. The distortions are multiplied by $2^{2R}$ to better indicate the convergence results. 

DFSQ has immediate implications in how sensors in acquisition networks collect and compress data when the designer 
knows the computation to follow. Using both theory and examples, we demonstrate that knowledge of the computation may 
change the quantization mapping and improve fMSE. Because the setup is very general, there is potential for impact in areas 
of signal acquisition where quantization is traditionally considered as a black box. Examples include multi-modal imaging 
technologies such as 3D imaging and parallel MRI. This theory can also be useful in collecting information for applications 
in machine learning and data mining. In these fields, large amounts of data are collected but the measure of interest is usually 
some nonlinear, low-dimensional quantity. DFSQ provides insight on how data should be collected to provide more accurate 
results when the resources for acquiring and storing information are limited. 



Appendix A 
Proof of Theorem 1 

Taylor's theorem states that a function $g$ that is $n + 1$ times continuously differentiable on a closed interval $[a, x]$ 
takes the form 

$$ g(x) = g(a) + \sum_{i=1}^{n} \frac{g^{(i)}(a)}{i!} (x - a)^i + R_n(x, a), $$

with a Taylor remainder term 

$$ R_n(x, a) = \frac{g^{(n+1)}(\xi)}{(n+1)!} (x - a)^{n+1} \tag{31} $$

for some $\xi \in [a, x]$. More specific to our framework, for any $x \in [c_k, p_k)$, the first-order remainder is bounded as 

$$ |R_1(x, c_k)| \le \frac{1}{2} \max_{\xi \in [c_k, p_k]} |g''(\xi)| \, (p_k - c_k)^2. $$

Using Condition UF2', we will uniformly bound $|g''(\xi)|$ by $C_u$. 

The first mean-value theorem for integrals states that for a continuous function $r : [a, b] \to \mathbb{R}$ and an integrable 
function $s : [a, b] \to [0, \infty)$ that does not change sign, there exists a value $x \in [a, b]$ such that 

$$ \int_a^b r(t) s(t) \, dt = r(x) \int_a^b s(t) \, dt. \tag{32} $$

For the case of the companding quantizers, combining this with the definition of the point density $\lambda$ means 

$$ \frac{1}{K} = \int_{p_k}^{p_{k+1}} \lambda(x) \, dx = \lambda(y_k)(p_{k+1} - p_k) = \lambda(y_k) \Delta_k \tag{33} $$

for some $y_k \in (p_k, p_{k+1}]$, where we have defined the $k$th quantizer cell length $\Delta_k = p_{k+1} - p_k$. The 
relationship between $K$, $\lambda$, and $\Delta_k$ is central in the proof. 
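To make the relation $1/K = \lambda(y_k)\Delta_k$ concrete, the sketch below uses a hypothetical point density $\lambda(x) = 2x$ on $[0, 1]$ (chosen only for illustration; it is not a density from the paper). Its compander is $w(x) = x^2$, so the cell boundaries are $p_k = \sqrt{k/K}$, and the mean-value point in each cell is obtained by solving $\lambda(y_k)\Delta_k = 1/K$ for $y_k$.

```python
import math

def cell_boundaries(K):
    """Boundaries p_0..p_K for the compander w(x) = x^2, i.e. lambda(x) = 2x."""
    return [math.sqrt(k / K) for k in range(K + 1)]

def mean_value_points(K):
    """Return (p_k, y_k, p_{k+1}) for each cell, with lambda(y_k) * Delta_k = 1/K."""
    p = cell_boundaries(K)
    cells = []
    for k in range(K):
        delta = p[k + 1] - p[k]          # cell length Delta_k
        y = 1.0 / (2.0 * K * delta)      # solves 2 * y * Delta_k = 1 / K
        cells.append((p[k], y, p[k + 1]))
    return cells
```

For this particular density each $y_k$ works out to the cell midpoint, so it lies inside the cell as the mean-value theorem guarantees. Cells are long where $\lambda$ is small and short where it is large.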

With these preparations, we continue to the proof. Consider expansion of $D_{\mathrm{fmse}}(K, \lambda)$ by total expectation: 

$$ D_{\mathrm{fmse}}(K, \lambda) = \sum_{k=0}^{K-1} \int_{p_k}^{p_{k+1}} |g(x) - g(c_k)|^2 f_X(x) \, dx. \tag{34} $$

We would like to eliminate the first and last terms of the sum because the unbounded interval of integration would cause 
problems with the approximation technique employed later. The last term is 

$$ \int_{p_{K-1}}^{\infty} |g(x) - g(p_{K-1})|^2 f_X(x) \, dx, \tag{35} $$

where we have used $c_{K-1} = p_{K-1}$. By Condition UF4', this is asymptotically negligible in comparison to 

$$ \left( \int_{p_{K-1}}^{\infty} \lambda(x) \, dx \right)^{2}. $$



Thus the last term (35) does not contribute to $\lim_{K \to \infty} K^2 D_{\mathrm{fmse}}(K, \lambda)$. We can similarly 
eliminate the first term, yielding 

$$ K^2 D_{\mathrm{fmse}}(K, \lambda) \sim K^2 \sum_{k=1}^{K-2} \int_{p_k}^{p_{k+1}} |g(x) - g(c_k)|^2 f_X(x) \, dx. \tag{36} $$

Effectively, Condition UF4' promises that the tail of the source distribution is decaying fast enough that we can ignore the 
distortion contributions outside the extremal codewords. 

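Further expansion of (36) using Taylor's theorem with remainder yields: 

$$
\begin{aligned}
K^2 D_{\mathrm{fmse}}(K, \lambda)
&\sim K^2 \sum_{k=1}^{K-2} \int_{p_k}^{p_{k+1}} |g'(c_k)(x - c_k) + R_1(x, c_k)|^2 f_X(x) \, dx \\
&\le \underbrace{K^2 \sum_{k=1}^{K-2} \int_{p_k}^{p_{k+1}} |g'(c_k)|^2 |x - c_k|^2 f_X(x) \, dx}_{(A)} \\
&\quad + \underbrace{K^2 \sum_{k=1}^{K-2} \int_{p_k}^{p_{k+1}} 2 |R_1(x, c_k)| \, |g'(c_k)| \, |x - c_k| f_X(x) \, dx}_{(B)} \\
&\quad + \underbrace{K^2 \sum_{k=1}^{K-2} \int_{p_k}^{p_{k+1}} R_1(x, c_k)^2 f_X(x) \, dx}_{(C)}.
\end{aligned}
$$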

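Of the three terms, only term (A) has a meaningful contribution. It can be simplified as follows: 

$$
\begin{aligned}
K^2 \sum_{k=1}^{K-2} \int_{p_k}^{p_{k+1}} |g'(c_k)|^2 |x - c_k|^2 f_X(x) \, dx
&\overset{(a)}{=} K^2 \sum_{k=1}^{K-2} |g'(c_k)|^2 f_X(v_k) \int_{p_k}^{p_{k+1}} |x - c_k|^2 \, dx \\
&\overset{(b)}{=} K^2 \sum_{k=1}^{K-2} |g'(c_k)|^2 f_X(v_k) \frac{\Delta_k^3}{12} \\
&\overset{(c)}{=} \frac{1}{12} \sum_{k=1}^{K-2} f_X(v_k) \left( \frac{g'(c_k)}{\lambda(y_k)} \right)^2 \Delta_k \\
&\overset{(d)}{\to} \frac{1}{12} \int \left( \frac{g'(x)}{\lambda(x)} \right)^2 f_X(x) \, dx,
\end{aligned}
\tag{37}
$$

where (a) arises from using (32), where $v_k$ is some point in the $k$th quantizer cell; (b) is evaluation of the integral, 
recalling the definition of the codewords $c_k$; (c) follows from (33); and (d) holds as $K \to \infty$ by the convergence of 
Riemann rectangles to the integral (assumption UF3'). 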
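As a numerical sanity check on the limit $\frac{1}{12}\int (g'(x)/\lambda(x))^2 f_X(x)\,dx$, the sketch below uses a deliberately simple, hypothetical setting that is not one of the paper's examples: $X$ uniform on $[0, 1]$, a uniform quantizer (so $\lambda(x) = 1$), midpoint codewords, $g(x) = x^2$, and the simple decoder $g(c_k)$. In this setting the limit evaluates to $\frac{1}{12}\int_0^1 (2x)^2\,dx = 1/9$. The function name and the sub-sampling parameter are illustrative assumptions.

```python
def scaled_fmse(K, sub=200):
    """K^2 * D_fmse for X ~ Uniform[0, 1], g(x) = x^2, and K uniform cells.

    Each cell is integrated with a midpoint rule using `sub` sample points.
    """
    delta = 1.0 / K
    h = delta / sub
    total = 0.0
    for k in range(K):
        c = (k + 0.5) * delta                    # codeword at the cell midpoint
        for j in range(sub):
            x = k * delta + (j + 0.5) * h
            total += (x * x - c * c) ** 2 * h    # f_X(x) = 1 on [0, 1]
    return K * K * total

LIMIT = 1.0 / 9.0  # (1/12) * integral of (2x)^2 over [0, 1]
```

Evaluating `scaled_fmse(K)` for increasing `K` shows the scaled distortion approaching `LIMIT`, with the gap shrinking roughly like $1/K^2$ in this particular setting.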
The higher-order error terms are negligible using the bound reviewed in (31). We now show that term (B) goes to zero: 

$$
\begin{aligned}
K^2 \sum_{k=1}^{K-2} \int_{p_k}^{p_{k+1}} 2 |R_1(x, c_k)| \, |g'(c_k)| \, |x - c_k| f_X(x) \, dx
&\overset{(a)}{\le} K^2 \sum_{k=1}^{K-2} C_u \Delta_k^2 |g'(c_k)| \int_{p_k}^{p_{k+1}} |x - c_k| f_X(x) \, dx \\
&\overset{(b)}{\le} K^2 \sum_{k=1}^{K-2} C_u \Delta_k^4 |g'(c_k)| f_X(v_k) \\
&\overset{(c)}{=} \frac{C_u}{K} \sum_{k=1}^{K-2} f_X(v_k) \frac{|g'(c_k)|}{\lambda(y_k)^3} \Delta_k \\
&\overset{(d)}{\to} 0,
\end{aligned}
\tag{38}
$$

where (a) follows from bounding $R_1(x, c_k)$ using (31); (b) arises from using (32) and bounding the integral; (c) follows from 
(33); and (d) holds as $K \to \infty$ by the convergence of Riemann rectangles to the integral (assumption UF3'). Hence, the 
distortion contribution becomes negligible as $K$ increases. 

A similar analysis can be used to show that expansion term (C) scales as $1/K^2$ with growing codebook size and is therefore 
also negligible. 
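One way the argument for term (C) might go is sketched below. It uses only the remainder bound $|R_1(x, c_k)| \le \frac{1}{2} C_u \Delta_k^2$, the mean-value theorem (32), and the cell-length relation (33). The additional assumption that $\int f_X(x)/\lambda(x)^4 \, dx$ is finite is ours and is not stated in the text.

```latex
\begin{aligned}
K^2 \sum_{k=1}^{K-2} \int_{p_k}^{p_{k+1}} R_1(x, c_k)^2 f_X(x) \, dx
&\le K^2 \sum_{k=1}^{K-2} \frac{C_u^2}{4} \Delta_k^4 \int_{p_k}^{p_{k+1}} f_X(x) \, dx \\
&=   K^2 \sum_{k=1}^{K-2} \frac{C_u^2}{4} \Delta_k^5 f_X(v_k) \\
&=   \frac{C_u^2}{4 K^2} \sum_{k=1}^{K-2} \frac{f_X(v_k)}{\lambda(y_k)^4} \Delta_k .
\end{aligned}
```

The remaining sum is a Riemann sum for $\int f_X(x)/\lambda(x)^4 \, dx$, so under the stated assumption the whole expression is of order $1/K^2$.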

Appendix B 
Proof of Theorem 2 

We parallel the proof of Theorem 1 using Taylor expansion and bounding the distortion contributions of each cell. We review 
the first-order version of the multivariate Taylor's theorem: a function that is twice continuously differentiable on a closed ball 
$B$ takes the form 

$$ g(x_1^N) = g(a_1^N) + \sum_{n=1}^{N} \left[ g_n(a_1^N)(x_n - a_n) \right] + R_1(x_1^N, a_1^N), $$

where we recall that $g_n(x_1^N)$ is the partial derivative of $g$ with respect to the $n$th argument evaluated at the point 
$x_1^N$. The remainder term is bounded by 

$$ |R_1(x_1^N, a_1^N)| \le C_m \sum_{i=1}^{N} \sum_{j=1}^{N} |x_i - a_i| \, |x_j - a_j| $$

under Condition MF2'. Applying a linear approximation to a quantizer cell $s$ with midpoint $(c_s)_1^N$ and side lengths 
$\{\Delta_i(s)\}_{i=1}^{N}$, the Taylor residual is 

$$
\begin{aligned}
|R_1(x_1^N, (c_s)_1^N)|
&\le C_m \sum_{i=1}^{N} \sum_{j=1}^{N} \Delta_i(s) \Delta_j(s) \\
&\overset{(a)}{\le} N C_m \Delta \sum_{i=1}^{N} \Delta_i(s) \\
&\overset{(b)}{=} N C_m \Delta \sum_{i=1}^{N} \frac{1}{K_i \lambda_i(y_{s,i})} \\
&\overset{(c)}{\le} \frac{N C_m \Delta}{\alpha \kappa} \sum_{i=1}^{N} \lambda_i(y_{s,i})^{-1},
\end{aligned}
\tag{39}
$$

where in (a) we define $\Delta$ as the longest quantizer interval length in any dimension; in (b) we invoke (33) with $y_{s,i}$ 
being the $i$th coordinate of some point in quantizer cell $s$; and in (c) we define $\alpha$ as the smallest $\alpha_i$. 

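Let $S_\kappa$ be the partition lattice induced by $N$ scalar quantizers, excluding the overload regions. By total expectation, 
we find the distortion of each partition cell and sum their contributions. By Condition MF4', the distortion from overload cells 
becomes negligible with increasing $\kappa$ and can be ignored. Using Taylor's theorem, the scaled total distortion becomes 

$$ \kappa^2 D_{\mathrm{fmse}}(K_1^N, \lambda_1^N) = A + B + C, $$

where 

$$
\begin{aligned}
A &= \kappa^2 \sum_{s \in S_\kappa} \int_{x_1^N \in s} \sum_{i=1}^{N} \sum_{j=1}^{N} g_i((c_s)_1^N) \, g_j((c_s)_1^N) \, (x_i - c_{s,i})(x_j - c_{s,j}) f_{X_1^N}(x_1^N) \, dx_1^N, \\
B &= \kappa^2 \sum_{s \in S_\kappa} \int_{x_1^N \in s} 2 R_1(x_1^N, (c_s)_1^N) \sum_{n=1}^{N} g_n((c_s)_1^N)(x_n - c_{s,n}) f_{X_1^N}(x_1^N) \, dx_1^N, \\
C &= \kappa^2 \sum_{s \in S_\kappa} \int_{x_1^N \in s} R_1(x_1^N, (c_s)_1^N)^2 f_{X_1^N}(x_1^N) \, dx_1^N.
\end{aligned}
$$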



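In term A, we may disregard all cross terms since the errors $(X_n - c_{s,n})$ become uncorrelated in the high-resolution 
approximation because the pdf in each cell becomes well-approximated by a uniform distribution as the cell gets smaller. The 
remaining components of the distortion are 

$$ \kappa^2 \sum_{s \in S_\kappa} \int_{x_1^N \in s} \sum_{n=1}^{N} g_n^2((c_s)_1^N)(x_n - c_{s,n})^2 f_{X_1^N}(x_1^N) \, dx_1^N. $$

Using (33), the distortion contribution becomes 

$$ \sum_{s \in S_\kappa} \int_{x_1^N \in s} \left( \sum_{n=1}^{N} \frac{g_n^2((c_s)_1^N)}{12 \alpha_n^2 \lambda_n^2(y_{s,n})} \right) f_{X_1^N}(x_1^N) \, dx_1^N, $$

where $y_{s,n}$ is the $n$th coordinate of some point in quantizer cell $s$. Using assumption MF3', this approaches the integral 
expression 

$$ \sum_{n=1}^{N} \frac{1}{12 \alpha_n^2} \mathrm{E}\left[ \left( \frac{g_n(X_1^N)}{\lambda_n(X_n)} \right)^2 \right]. $$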



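It remains to show that the remainder terms B and C may be ignored. First, consider any of the $N$ summands that constitute B: 

$$
\begin{aligned}
&\kappa^2 \sum_{s \in S_\kappa} \int_{x_1^N \in s} 2 |g_n((c_s)_1^N)| \, |x_n - c_{s,n}| \, |R_1(x_1^N, (c_s)_1^N)| f_{X_1^N}(x_1^N) \, dx_1^N \\
&\quad \overset{(a)}{\le} \kappa^2 \sum_{s \in S_\kappa} \int_{x_1^N \in s} 2 |g_n((c_s)_1^N)| \, |x_n - c_{s,n}| \, \frac{N C_m \Delta}{\alpha \kappa} \sum_{i=1}^{N} \lambda_i(y_{s,i})^{-1} f_{X_1^N}(x_1^N) \, dx_1^N \\
&\quad \overset{(b)}{\le} \frac{2 N C_m \Delta \kappa}{\alpha} \sum_{s \in S_\kappa} \sum_{i=1}^{N} |g_n((c_s)_1^N)| \, \Delta_n(s) \, \lambda_i(y_{s,i})^{-1} f_{X_1^N}((v_s)_1^N) \prod_{j=1}^{N} \Delta_j(s) \\
&\quad \overset{(c)}{=} \frac{2 N C_m \Delta}{\alpha \alpha_n} \sum_{s \in S_\kappa} \sum_{i=1}^{N} |g_n((c_s)_1^N)| \, \lambda_i(y_{s,i})^{-1} \lambda_n(y_{s,n})^{-1} f_{X_1^N}((v_s)_1^N) \prod_{j=1}^{N} \Delta_j(s) \\
&\quad \overset{(d)}{\to} \lim_{\kappa \to \infty} \frac{2 N C_m \Delta}{\alpha \alpha_n} \sum_{i=1}^{N} \mathrm{E}\left[ \frac{|g_n(X_1^N)|}{\lambda_i(X_i) \lambda_n(X_n)} \right] = 0,
\end{aligned}
$$

where (a) follows from (39); (b) employs (32) and bounds $|x_n - c_{s,n}| \le \Delta_n$; (c) invokes (33); and (d) is valid 
according to assumption MF3'. 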

Remainder term C is negligible in a similar manner, which proves the theorem. 